56 research outputs found

    Comparative analysis of thermophilic and mesophilic proteins using Protein Energy Networks

    Background: Thermophilic proteins sustain themselves and function at higher temperatures. Despite their structural and functional similarities with their mesophilic homologues, they show enhanced stability. Various comparative studies at genomic, protein sequence and structure levels, and experimental works highlight the different factors and dominant interacting forces contributing to this increased stability. Methods: In this comparative structure-based study, we have used interaction energies between amino acids to generate structure networks called Protein Energy Networks (PENs). These PENs are used to compute network, sub-graph, and node-specific parameters. These parameters are then compared between the thermophile-mesophile homologues. Results: The results show an increased number of clusters and low-energy cliques in thermophiles as the main contributing factors for their enhanced stability. Furthermore, we see an increase in the number of hubs in thermophiles. We also observe no community of electrostatic cliques forming in PENs. Conclusion: In this study we were able to take an energy-based network approach to identify the factors responsible for enhanced stability of thermophiles by comparative analysis. We were able to point out that the sub-graph parameters are the prominent contributing factors. The thermophiles have a better-packed hydrophobic core. We have also discussed how thermophiles, although increasing stability through higher connectivity, retain conformational flexibility, from a cliques and communities perspective.

    Computational Prediction of Heme-Binding Residues by Exploiting Residue Interaction Network

    Computational identification of heme-binding residues is beneficial for predicting and designing novel heme proteins. Here we propose a novel method for heme-binding residue prediction that exploits the topological properties of these residues in the residue interaction networks derived from three-dimensional structures. Comprehensive analysis showed that key residues located in heme-binding regions are generally associated with nodes of higher degree, closeness and betweenness, but lower clustering coefficient, in the network. HemeNet, a support vector machine (SVM) based predictor, was developed to identify heme-binding residues by combining topological features with existing sequence and structural features. The results showed that incorporation of network-based features significantly improved the prediction performance. We also compared the residue interaction networks of heme proteins before and after heme binding and found that the topological features characterize the heme-binding sites of apo structures as well as those of holo structures, which led to a reliable performance improvement when we applied HemeNet to predicting the binding residues of proteins in the heme-free state. The HemeNet web server is freely accessible at http://mleg.cse.sc.edu/hemeNet/
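
    As an illustration of two of the topological features named in this abstract, the sketch below computes node degree and the local clustering coefficient on a small hand-made network. The toy network, the function names, and the dict-of-sets representation are assumptions made for this example; this is not the HemeNet implementation, and closeness and betweenness are not covered here.

```python
from itertools import combinations


def degree(adj, node):
    """Number of direct neighbours of `node`."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; reported as 0 here
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy residue interaction network (undirected, dict of sets).
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

features = {n: (degree(toy, n), clustering_coefficient(toy, n)) for n in toy}
```

    In this toy network, node A has the highest degree but a lower clustering coefficient than B or C, which is the kind of pattern the abstract reports for heme-binding residues.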

    Identification of B Cell Epitopes of Alcohol Dehydrogenase Allergen of Curvularia lunata

    BACKGROUND/OBJECTIVE: Epitope identification assists in developing molecules for clinical applications and is useful in defining molecular features of allergens for understanding structure/function relationships. The present study aimed to identify the B cell epitopes of alcohol dehydrogenase (ADH) allergen from Curvularia lunata using in-silico methods and immunoassay. METHOD: B cell epitopes of ADH were predicted by sequence- and structure-based methods and protein-protein interaction tools, while T cell epitopes were predicted by inhibitory concentration and binding score methods. The epitopes were superimposed on a three-dimensional model of ADH generated by homology modeling and analyzed for antigenic characteristics. Peptides corresponding to predicted epitopes were synthesized and immunoreactivity was assessed by ELISA using individual and pooled patients' sera. RESULT: The homology model showed a GroES-like catalytic domain joined to a Rossmann superfamily domain by an alpha helix. Stereochemical quality was confirmed by Procheck, which showed 90% of residues in the most favorable region of the Ramachandran plot, while Errat gave a quality score of 92.733%. Six B cell (P1-P6) and four T cell (P7-P10) epitopes were predicted by a combination of methods. Peptide P2 (epitope P2) showed the E(X)(2)GGP(X)(3)KKI conserved pattern among allergens of the pathogenesis related family. It was predicted as a high-affinity binder based on electronegativity and low hydrophobicity. The computational methods employed were validated using Bet v 1 and Der p 2 allergens, where 67% and 60% of the epitope residues were predicted correctly. Among B cell epitopes, Peptide P2 showed maximum IgE binding with individual and pooled patients' sera (mean OD 0.604±0.059 and 0.506±0.0035, respectively), followed by the P1, P4 and P3 epitopes. All T cell epitopes showed lower IgE binding. CONCLUSION: Four B cell epitopes of C. lunata ADH were identified. Peptide P2 can serve as a potential candidate for diagnosis of allergic diseases.

    How accurate and statistically robust are catalytic site predictions based on closeness centrality?

    Background: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. Results: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. Conclusion: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.
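
    The closeness centrality definition in this abstract (reciprocal of the average distance between vertex i and all other vertices) can be sketched as follows. This is a minimal illustration and not the authors' code: the 8.5 Å contact cutoff, the function names, and the toy coordinates are assumptions made for the example, and path lengths are counted in edges using breadth-first search.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff=8.5):
    """Link residues whose alpha-carbons lie within `cutoff` (assumed value, in angstroms)."""
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def closeness_centrality(adj, i):
    """Reciprocal of the mean shortest-path length from vertex i to the vertices it can reach."""
    hops = {i: 0}
    queue = deque([i])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    total = sum(hops.values())
    if total == 0:
        return 0.0  # isolated vertex
    return (len(hops) - 1) / total


# Toy example: five pseudo-residues on a straight line, 5 angstroms apart.
coords = [(5.0 * k, 0.0, 0.0) for k in range(5)]
adj = contact_graph(coords)
scores = [closeness_centrality(adj, i) for i in range(len(coords))]
best = max(range(len(scores)), key=scores.__getitem__)
```

    In the toy chain the central residue receives the highest score, matching the intuition that closeness centrality favours positions near the middle of the network.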

    Modeling allosteric signal propagation using protein structure networks

    Allosteric communication in proteins can be induced by the binding of effective ligands, mutations or covalent modifications that regulate a site distant from the perturbed region. To understand allosteric regulation, it is important to identify the remote sites that are affected by the perturbation-induced signals and how these allosteric perturbations are transmitted within the protein structure. In this study, by constructing a protein structure network and modeling signal transmission with a Markov random walk, we developed a method to estimate the signal propagation and the resulting effects. In our model, the global perturbation effects from a particular signal initiation site were estimated by calculating the expected visiting time (EVT), which describes the signal-induced effects caused by signal transmission through all possible routes. We hypothesized that the residues with high EVT values play important roles in allosteric signaling. We applied our model to two protein structures as examples, and verified the validity of our model using various types of experimental data. We also found that the hot spots in protein binding interfaces have significantly high EVT values, which suggests that they play roles in mediating signal communication between protein domains
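
    The abstract does not give the formula for the expected visiting time, so the sketch below shows only the generic building block it mentions: an unbiased Markov random walk on a structure network, started from a signal initiation site. The toy network, the function names, and the equal transition weights are assumptions for illustration; the published model may weight edges differently, and no EVT is computed here.

```python
def transition_matrix(adj):
    """Row-stochastic matrix of an unbiased walk: P[i][j] = 1/deg(i) when i and j are linked."""
    n = len(adj)
    P = [[0.0] * n for _ in range(n)]
    for i, neighbours in enumerate(adj):
        for j in neighbours:
            P[i][j] = 1.0 / len(neighbours)
    return P


def propagate(P, start, steps):
    """Probability of finding the walker at each node after `steps` steps from `start`."""
    n = len(P)
    p = [0.0] * n
    p[start] = 1.0
    for _ in range(steps):
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return p


# Hypothetical toy network: a chain of four residues, 0-1-2-3.
chain = [[1], [0, 2], [1, 3], [2]]
P = transition_matrix(chain)
after_two_steps = propagate(P, start=0, steps=2)
```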

    VASCo: computation and visualization of annotated protein surface contacts

    Background: Structural data from crystallographic analyses contain a vast amount of information on protein-protein contacts. Knowledge of protein-protein interactions is essential for understanding many processes in living cells. The methods to investigate these interactions range from genetics to biophysics, crystallography, bioinformatics and computer modeling. Crystal contact information can also be useful for understanding biologically relevant protein oligomerisation, as both rely in principle on the same physico-chemical interaction forces. Visualization of crystal and biological contact data including different surface properties can help to analyse protein-protein interactions. Results: VASCo is a program package for the calculation of protein surface properties and the visualization of annotated surfaces. Special emphasis is laid on protein-protein interactions, which are calculated based on surface point distances. The same approach is used to compare surfaces of two aligned molecules. Molecular properties such as electrostatic potential or hydrophobicity are mapped onto these surface points. Molecular surfaces and the corresponding properties are calculated using well-established programs integrated into the package, as well as custom-developed programs. The modular package can easily be extended to include new properties for annotation. The output of the program is most conveniently displayed in PyMOL using a custom-made plug-in. Conclusion: VASCo supplements other available protein contact visualisation tools and provides additional information on biological interactions as well as on crystal contacts. The tool provides a unique feature to compare surfaces of two aligned molecules based on point distances and thereby facilitates the visualization and analysis of surface differences.

    A Search for Energy Minimized Sequences of Proteins

    In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph for ranking the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to the Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the non-redundant protein sequence database that are similar to ours with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space once the protein is able to perform its function.

    Dr. PIAS: an integrative system for assessing the druggability of protein-protein interactions

    Background: The amount of data on protein-protein interactions (PPIs) available in public databases and in the literature has rapidly expanded in recent years. PPI data can provide useful information for researchers in pharmacology and medicine as well as those in interactome studies. There is an urgent need for a novel methodology or software allowing the efficient utilization of PPI data in pharmacology and medicine. Results: To address this need, we have developed the 'Druggable Protein-protein Interaction Assessment System' (Dr. PIAS). Dr. PIAS has a meta-database that stores various types of information (tertiary structures, drugs/chemicals, and biological functions associated with PPIs) retrieved from public sources. By integrating this information, Dr. PIAS assesses whether a PPI is druggable as a target for small chemical ligands by using a supervised machine-learning method, the support vector machine (SVM). Dr. PIAS holds not only known druggable PPIs but also all PPIs of human, mouse, rat, and human immunodeficiency virus (HIV) proteins identified to date. Conclusions: The design concept of Dr. PIAS is distinct from other published PPI databases in that it focuses on selecting the PPIs most likely to make good drug targets, rather than merely collecting PPI data.

    False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Background: Many crucial cellular operations such as metabolism, signalling, and regulation are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for this lack is poor agreement between experimental findings and computational sets, which in turn comes from the large number of false positive predictions in computational approaches. Reducing false positive predictions and enhancing the true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (averaged over the four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially remove falsely predicted PPI pairs. Removal of false positives from predicted datasets increases the true positive fractions of the datasets and improves the robustness of predicted pairs as compared to random protein pairing, and eventually results in better overlap with experimental results.

    Novel Feature for Catalytic Protein Residues Reflecting Interactions with Other Residues

    Owing to their potential for systematic analysis, complex networks have been widely used in proteomics. Representing a protein structure as a topology network provides novel insight into understanding protein folding mechanisms, stability and function. Here, we develop a new feature to reveal correlations between residues using a protein structure network. In an original attempt to quantify the effects of several key residues on catalytic residues, a power function was used to model interactions between residues. The results indicate that focusing on a few residues is a feasible approach to identifying catalytic residues. The spatial environment surrounding a catalytic residue was analyzed in a layered manner. We present evidence that (i) correlation between residues is related to their distance apart (most environmental parameters of the outer layer make a smaller contribution to prediction) and (ii) catalytic residues tend to be located near key positions in enzyme folds. Feature analysis revealed satisfactory performance for our features, which were combined with several conventional features in a prediction model for catalytic residues using a comprehensive data set from the Catalytic Site Atlas. Values of 88.6 for sensitivity and 88.4 for specificity were obtained by 10-fold cross-validation. These results suggest that these features reveal the mutual dependence of residues and are promising for further study of structure-function relationships.